Let $f_{n,h}$ be a kernel density estimator of a continuous and bounded
$d$-dimensional density $f$. Let $\psi(t)$ be a positive continuous function
such that $\|\psi f^{\beta}\|_\infty < \infty$ for some $0 < \beta < 1/2$. We are interested in
the rate of consistency of such estimators with respect to the weighted
sup-norm determined by $\psi$. This problem has been considered by
1 Introduction



Let $X, X_1, X_2, \ldots$ be i.i.d. $\mathbb{R}^d$-valued random vectors and assume that the
common distribution of these random vectors has a bounded Lebesgue density
function, which we shall denote by $f$. A kernel $K$ will be any measurable
positive function which satisfies the following conditions:

(K.i) $\int_{\mathbb{R}^d} K(s)\,ds = 1$,

(K.ii) $\|K\|_\infty := \sup_{x\in\mathbb{R}^d} |K(x)| = \kappa < \infty$.

The kernel density estimator of $f$ based upon the sample $X_1, \ldots, X_n$ and
bandwidth $0 < h < 1$ is defined as follows,

\[ f_{n,h}(t) = \frac{1}{nh}\sum_{i=1}^{n} K\Big(\frac{X_i - t}{h^{1/d}}\Big), \quad t \in \mathbb{R}^d. \]

Choosing a suitable bandwidth sequence $h_n \to 0$ and assuming that the density
$f$ is continuous, one obtains a strongly consistent estimator $f_n := f_{n,h_n}$
of $f$, i.e. one has with probability 1, $f_n(t) \to f(t)$, $t \in \mathbb{R}^d$. There are also
results concerning uniform convergence and convergence rates. For proving
such results one usually writes the difference $f_n(t) - f(t)$ as the sum of a
probabilistic term $f_n(t) - \mathbb{E}f_n(t)$ and a deterministic term $\mathbb{E}f_n(t) - f(t)$,
the so-called bias. The order of the bias depends on smoothness properties of
$f$ only, whereas the first (random) term can be studied via empirical process
techniques as has been pointed out by Stute and Pollard (see [10-13]), among
other authors.
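To make the bandwidth convention concrete, the following is a minimal numerical sketch of the estimator $f_{n,h}(t) = (nh)^{-1}\sum_{i} K((X_i - t)/h^{1/d})$, in which $h$ plays the role of a volume and $h^{1/d}$ that of a side length. The uniform kernel on $[-1/2,1/2]^d$ and the function names `uniform_kernel` and `kde` are illustrative assumptions, not choices made in the paper.

```python
import numpy as np

def uniform_kernel(u):
    """Indicator of the cube [-1/2, 1/2]^d (assumed kernel): integrates to 1, bounded by 1."""
    return np.all(np.abs(u) <= 0.5, axis=-1).astype(float)

def kde(t, sample, h, kernel=uniform_kernel):
    """Evaluate f_{n,h}(t) = (n h)^{-1} * sum_i K((X_i - t) / h^{1/d})."""
    sample = np.asarray(sample, dtype=float)
    sample = sample.reshape(sample.shape[0], -1)      # shape (n, d)
    n, d = sample.shape
    t = np.asarray(t, dtype=float).reshape(1, -1)     # shape (1, d)
    scaled = (sample - t) / h ** (1.0 / d)            # rescale by the side length h^{1/d}
    return kernel(scaled).sum() / (n * h)

# One-dimensional example: two of the three points lie within h/2 of t = 0.
print(kde(0.0, [0.0, 0.1, 1.0], 0.5))
```

With this normalization, a cube of side $h^{1/d}$ has volume $h$, which is why the factor in front of the sum is $1/(nh)$ in every dimension.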



After the work of Talagrand [14], who established optimal exponential
inequalities for empirical processes, there has been some renewed interest in
these problems. Einmahl and Mason [3] looked at a large class of kernel
type estimators including density and regression function estimators and
determined the precise order of uniform convergence of the probabilistic term
over compact subsets. Giné and Guillou [5] (see also Deheuvels [1]) showed
that if $K$ is a "regular" kernel, the density function $f$ is bounded and $h_n$
satisfies among others the regularity conditions

\[ \frac{\log(1/h_n)}{\log\log n} \to \infty \quad\text{and}\quad \frac{nh_n}{\log n} \to \infty, \]

one has with probability 1,

\[ \|f_n - \mathbb{E}f_n\|_\infty = O\Big(\sqrt{\frac{|\log h_n|}{nh_n}}\Big). \tag{1.1} \]

Moreover, this rate cannot be improved.

Recently, Giné, Koltchinskii and Zinn (see [8]) obtained refinements of
these results by establishing the same convergence rate for density estimators
with respect to weighted sup-norms. Under additional assumptions on
the bandwidth sequence and the density function, they provided necessary
and sufficient conditions for stochastic and almost sure boundedness for the
quantity

\[ \sqrt{\frac{nh_n}{|\log h_n|}}\,\sup_{t\in\mathbb{R}^d}\big|\psi(t)\{f_n(t) - \mathbb{E}f_n(t)\}\big|. \]

Results of this type can be very useful when estimating integral functionals of
the density $f$ (see for example Mason [9]). Suppose for instance that we want
to estimate $\int_{\mathbb{R}^d}\phi(f(t))\,dt < \infty$ where $\phi : \mathbb{R} \to \mathbb{R}$ is a measurable function.
Then a possible estimator would be given by $\int_{\mathbb{R}^d}\phi(f_{n,h}(t))\,dt$. Assuming that
$\phi$ is Lipschitz and that $\int_{\mathbb{R}^d} f^{\beta}(t)\,dt =: c_\beta < \infty$ for some $0 < \beta < 1/2$, one
can conclude that for some constant $D > 0$,

\[ \Big|\int_{\mathbb{R}^d}\phi(f_{n,h}(t))\,dt - \int_{\mathbb{R}^d}\phi(\mathbb{E}f_{n,h}(t))\,dt\Big| \le D\,c_\beta \sup_{t\in\mathbb{R}^d}\big|f^{-\beta}(t)\{f_{n,h}(t) - \mathbb{E}f_{n,h}(t)\}\big|, \]

and we see that this term is of order $\sqrt{|\log h|/nh}$. For some further related
results, see also Giné, Koltchinskii and Sakhanenko [6,7].



In practical applications the statistician has to look at the bias as well.
It is well known that if one chooses small bandwidth sequences, the bias will
be small whereas the probabilistic term, which is of order $O(\sqrt{|\log h_n|/nh_n})$,
might be too large. On the other hand, choosing a large bandwidth sequence
will increase the bias. So the statistician has to balance both terms and
typically, one obtains bandwidth sequences which depend on some quantity
involving the unknown distribution. Replacing this quantity by a suitable
estimator, one ends up with a bandwidth sequence depending on the data
$X_1, \ldots, X_n$ and, in some cases, also on the location $x$. There are many elaborate
schemes available in the statistical literature for finding such bandwidth
sequences. We refer the interested reader to the article by Deheuvels and
Mason [2] (especially Sections 2.3 and 2.4) and the references therein.
Unfortunately, one can no longer investigate the behavior of such estimators via
the aforementioned results, since they are dealing with density estimators
based on deterministic bandwidth sequences.

To overcome this difficulty, Einmahl and Mason [4] introduced a method
allowing them to obtain "uniform in $h$" versions of some of their earlier
results as well as of (1.1). These results are immediately applicable for proving
uniform consistency of kernel-type estimators when the bandwidth $h$ is a
function of the location $x$ or the data $X_1, \ldots, X_n$.

It is natural then to ask whether one can also obtain such "uniform in
$h$" versions of some of the results by Giné, Koltchinskii and Zinn [8]. We
will answer this in the affirmative by using a method which is based on a
combination of some of their ideas with those of Einmahl and Mason [4].

In order to formulate our results, let us first specify what we mean by
a "regular" kernel $K$. First of all, we will assume throughout that $K$ is
compactly supported. Rescaling $K$ if necessary, we can assume that its
support is contained in $[-1/2, 1/2]^d$. Next consider the class of functions

\[ \mathcal{K} = \{K((\cdot - t)/h^{1/d}) : h > 0,\ t \in \mathbb{R}^d\}. \]

For $\epsilon > 0$, let $N(\epsilon, \mathcal{K}) = \sup_Q N(\kappa\epsilon, \mathcal{K}, d_Q)$, where the supremum is taken
over all probability measures $Q$ on $(\mathbb{R}^d, \mathcal{B})$, $d_Q$ is the $L_2(Q)$-metric and, as
usual, $N(\epsilon, \mathcal{K}, d_Q)$ is the minimal number of balls $\{g : d_Q(g, g') < \epsilon\}$ of
$d_Q$-radius $\epsilon$ needed to cover $\mathcal{K}$. We assume that $\mathcal{K}$ satisfies the following uniform
entropy condition:

(K.iii) for some $C > 0$ and $\nu > 0$: $N(\epsilon, \mathcal{K}) \le C\epsilon^{-\nu}$, $0 < \epsilon < 1$.

Van der Vaart and Wellner [15] provide a number of sufficient conditions
for (K.iii) to hold. For instance, it is satisfied for general $d \ge 1$, whenever
$K(x) = \phi(p(x))$, with $p(x)$ being a polynomial in $d$ variables and $\phi$ a
real-valued function of bounded variation. Refer also to condition (K) in [8].

Finally, to avoid using outer probability measures in all of our statements,
we impose the following measurability assumption:

(K.iv) $\mathcal{K}$ is a pointwise measurable class.
With "pointwise measurable", we mean that there exists a countable subclass
$\mathcal{K}_0 \subset \mathcal{K}$ such that we can find for any function $g \in \mathcal{K}$ a sequence of functions
$g_m \in \mathcal{K}_0$ for which $g_m(z) \to g(z)$, $z \in \mathbb{R}^d$. This condition is discussed in
van der Vaart and Wellner [15] and in particular it is satisfied whenever $K$
is right continuous. The following assumptions were introduced by Giné,
Koltchinskii and Zinn [8]. Note that we need slightly less regularity since we
will not determine the precise limiting constant or limiting distribution. In
the following we will denote the sup-norm on $\mathbb{R}^d$ by $|\cdot|$.

Assumptions on the density. Let $B_f := \{t \in \mathbb{R}^d : f(t) > 0\}$ be the
positivity set of $f$, and assume that $B_f$ is open and that the density $f$ is
bounded and continuous on $B_f$. Further, assume that

(D.i) $\forall\,\delta > 0$, $\exists\,h_0 > 0$ and $0 < c < \infty$ such that $\forall\,x, x+y \in B_f$,

\[ c^{-1}f^{1+\delta}(x) \le f(x+y) \le c\,f^{1-\delta}(x), \quad |y| \le h_0, \]

(D.ii) $\forall\,r > 0$, set $F_r(h) := \{(x,y) : x+y \in B_f,\ f(x) \ge h^r,\ |y| \le h\}$, then

\[ \lim_{h\to 0}\ \sup_{(x,y)\in F_r(h)} \Big|\frac{f(x+y)}{f(x)} - 1\Big| = 0. \]

Assumptions on the weight function $\psi$.

(W.i) $\psi : B_f \to \mathbb{R}^+$ is positive and continuous,

(W.ii) $\forall\,\delta > 0$, $\exists\,h_0 > 0$ and $0 < c < \infty$ such that $\forall\,x, x+y \in B_f$,

\[ c^{-1}\psi^{1-\delta}(x) \le \psi(x+y) \le c\,\psi^{1+\delta}(x), \quad |y| \le h_0, \]

(W.iii) $\forall\,r > 0$, set $G_r(h) := \{(x,y) : x+y \in B_f,\ \psi(x) \le h^{-r},\ |y| \le h\}$, then

\[ \lim_{h\to 0}\ \sup_{(x,y)\in G_r(h)} \Big|\frac{\psi(x+y)}{\psi(x)} - 1\Big| = 0. \]

Extra assumptions. For $0 < \beta < 1/2$, assume that

(WD.i) $\|f^{\beta}\psi\|_\infty = \sup_{t\in B_f}|f^{\beta}(t)\psi(t)| < \infty$,

(WD.ii) $\forall\,r > 0$,

\[ \lim_{h\to 0}\ \sup_{(x,y)\in G_r(h)} \Big|\frac{f(x+y)}{f(x)} - 1\Big| = 0. \]

A possible choice for the weight function would be $\psi = f^{-\beta}$, in which case
the last assumptions follow from the corresponding one involving the density.
For some discussion of these conditions and examples, see page 2573 of Giné,
Koltchinskii and Zinn [8].



Now, consider two decreasing functions

\[ a_t := a(t) = t^{-\alpha}L_1(t) \quad\text{and}\quad b_t := b(t) := t^{-\mu}L_2(t), \quad t > 0, \]

where $0 < \mu < \alpha < 1$ and $L_1, L_2$ are slowly varying functions. Further define
the functions

\[ \lambda(t) := \sqrt{t\,a_t\,|\log a_t|}, \quad t > 0, \]
\[ \lambda_n(h) := \sqrt{nh\,|\log h|}, \quad n \ge 1,\ a_n \le h \le b_n, \]

and it is easy to see that the function $\lambda$ is regularly varying at infinity with
positive exponent $0 < \eta := (1-\alpha)/2 < 1/2$ for some $0 < \alpha < 1$. Finally, we
assume that $\lambda(t)$ is strictly increasing ($t > 0$).
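As a rough numerical illustration of these definitions, the sketch below evaluates $\lambda(t) = \sqrt{t\,a_t|\log a_t|}$ and $\lambda_n(h) = \sqrt{nh|\log h|}$ in the simplest case where the slowly varying factors are taken to be constant, $L_1 = L_2 = 1$. The exponents `ALPHA = 0.8` and `MU = 0.2` are hypothetical values chosen only for the example. It checks the identity $\lambda_n(a_n) = \lambda(n)$ and compares $\lambda(2t)/\lambda(t)$ with $2^{(1-\alpha)/2}$ for a large $t$.

```python
import math

# Hypothetical exponents with 0 < MU < ALPHA < 1; slowly varying parts set to 1.
ALPHA, MU = 0.8, 0.2

def a(t):
    """Lower bandwidth function a_t = t^{-alpha} (assuming L_1 = 1)."""
    return t ** (-ALPHA)

def b(t):
    """Upper bandwidth function b_t = t^{-mu} (assuming L_2 = 1)."""
    return t ** (-MU)

def lam(t):
    """lambda(t) = sqrt(t * a_t * |log a_t|)."""
    return math.sqrt(t * a(t) * abs(math.log(a(t))))

def lam_n(n, h):
    """lambda_n(h) = sqrt(n * h * |log h|) for a bandwidth 0 < h < 1."""
    return math.sqrt(n * h * abs(math.log(h)))

n = 10 ** 6
print(lam_n(n, a(n)), lam(n))                             # the two values coincide
print(lam(2e12) / lam(1e12), 2 ** ((1 - ALPHA) / 2))      # close for large t
```

The logarithmic factor makes the convergence of $\lambda(2t)/\lambda(t)$ slow, which is typical of regularly varying functions with a non-trivial slowly varying part.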

Theorem 1.1 Assume that the above hypotheses are satisfied for some $0 <
\beta < 1/2$, and that we additionally have

\[ \limsup_{t\to\infty}\ t\,\mathbb{P}\{\psi(X) > \lambda(t)\} < \infty. \tag{1.2} \]

Then it follows that

\[ \Delta_n := \sup_{a_n\le h\le b_n}\sqrt{\frac{nh}{|\log h|}}\,\|\psi(f_{n,h} - \mathbb{E}f_{n,h})\|_\infty \]

is stochastically bounded.

Note that if we choose $a_n = b_n = h_n$ we re-obtain the first part of Theorem 2.1
in Giné, Koltchinskii and Zinn [8]. They have shown that assumption
(1.2) is necessary for this part of their result if $B_f = \mathbb{R}^d$ or $K(0) = \kappa$.
Therefore this assumption is also necessary for our Theorem 1.1.






Remark. Choosing the estimator $f_{n,h_n}$ where $h_n = H_n(X_1, \ldots, X_n; x) \in
[a_n, b_n]$ is a general bandwidth sequence (possibly depending on $x$ and the
observations $X_1, \ldots, X_n$) one obtains that

\[ \|\psi(f_{n,h_n} - \mathbb{E}f_{n,h_n})\|_\infty = O_P\big(\sqrt{|\log a_n|/na_n}\big). \tag{1.3} \]

Indeed, due to the monotonicity of the function $h \mapsto nh/|\log h|$, $0 < h < 1$,
we can infer from the stochastic boundedness of $\Delta_n$ that for all $\epsilon > 0$ and
large enough $n$, there is a finite constant $C_\epsilon$ such that

\[ \mathbb{P}\Big\{\sup_{a_n\le h\le b_n}\|\psi(f_{n,h} - \mathbb{E}f_{n,h})\|_\infty > C_\epsilon\sqrt{\frac{|\log a_n|}{na_n}}\Big\} \le \epsilon, \]

which in turn trivially implies (1.3). Note that this is exactly the same
stochastic order as for the estimator $f_{n,a_n}$ where one uses the deterministic
bandwidth sequence $a_n$.
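The monotonicity argument in this remark can be checked numerically. The sketch below is only an illustration under assumed values: it takes $a_n = n^{-0.8}$ and $b_n = n^{-0.2}$ (hypothetical choices with constant slowly varying parts) and verifies on a geometric grid that the rate $\sqrt{|\log h|/(nh)}$ is largest at the left endpoint $h = a_n$, which is what allows the uniform bound over $[a_n, b_n]$ to be expressed in terms of $a_n$ alone.

```python
import math

def rate(n, h):
    """Stochastic order sqrt(|log h| / (n h)) attached to a bandwidth 0 < h < 1."""
    return math.sqrt(abs(math.log(h)) / (n * h))

def worst_rate_on_interval(n, a_n, b_n, num=200):
    """Largest value of rate(n, h) over a geometric grid of num points in [a_n, b_n]."""
    grid = [a_n * (b_n / a_n) ** (k / (num - 1)) for k in range(num)]
    return max(rate(n, h) for h in grid)

n = 10 ** 4
a_n, b_n = n ** (-0.8), n ** (-0.2)     # hypothetical bandwidth bounds
print(worst_rate_on_interval(n, a_n, b_n), rate(n, a_n))
```

Since $h \mapsto |\log h|/h$ is decreasing on $(0,1)$, any bandwidth selected from $[a_n, b_n]$, whether data-driven or not, inherits at worst the rate of the smallest admissible bandwidth.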

Theorem 1.2 Assume that the above hypotheses are satisfied for some $0 <
\beta < 1/2$, and that we additionally have

\[ \int^{\infty}\mathbb{P}\{\psi(X) > \lambda(t)\}\,dt < \infty. \tag{1.4} \]

Then we have with probability one,

\[ \limsup_{n\to\infty}\ \sup_{a_n\le h\le b_n}\sqrt{\frac{nh}{|\log h|}}\,\|\psi(f_{n,h} - \mathbb{E}f_{n,h})\|_\infty \le C, \tag{1.5} \]

where $C$ is a finite constant.

Remark. If we consider the special case $a_n = b_n$, and if we use the
deterministic bandwidth sequence $h_n = a_n$, we obtain from the almost sure
finiteness of $\Delta_n$ that for the kernel density estimator $f_n = f_{n,h_n}$, with
probability one,

\[ \limsup_{n\to\infty}\ \sqrt{\frac{nh_n}{|\log h_n|}}\,\|\psi(f_n - \mathbb{E}f_n)\|_\infty \le C < \infty. \]

Moreover we can apply Proposition 2.6 of Giné, Koltchinskii and Zinn [8],
and hence the latter implies assumption (1.4) to be necessary for (1.5) if
$B_f = \mathbb{R}^d$ or $K(0) > 0$.

Furthermore, with the same reasoning as in the previous remark following
the stochastic boundedness result, Theorem 1.2 applied to density estimators
$f_{n,h_n}$ with general (stochastic) bandwidth sequences $h_n = H_n(X_1, \ldots, X_n; x) \in
[a_n, b_n]$ leads to the same almost sure order $O(\sqrt{|\log a_n|/na_n})$ as the one that
would be obtained by choosing a deterministic bandwidth sequence $h_n = a_n$.

We shall prove Theorem 1.1 in Section 2 and the proof of Theorem 1.2
will be given in Section 3. In both cases we will bound $\Delta_n$ by a sum of several
terms and we show already in Section 2 that most of these terms are almost
surely bounded. To do that, we have to bound certain binomial probabilities,
and use an empirical process representation of kernel estimators. So
essentially, there will be only one term left for which we still have to prove
almost sure boundedness, which will require the stronger assumption (1.4)
in Theorem 1.2.

2 Proof of Theorem 1.1 

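Throughout this whole section we will assume that the general assumptions
specified in Section 1 as well as condition (1.2) are satisfied. Moreover, we
will assume without loss of generality that $\|f^{\beta}\psi\|_\infty \le 1$.

Recall that we have for any $t \in B_f$ and $a_n \le h \le b_n$,

\[ \sqrt{\frac{nh}{|\log h|}}\,\psi(t)\{f_{n,h}(t) - \mathbb{E}f_{n,h}(t)\} = \frac{\psi(t)}{\lambda_n(h)}\sum_{i=1}^{n}K\Big(\frac{X_i - t}{h^{1/d}}\Big) - \frac{n\psi(t)}{\lambda_n(h)}\,\mathbb{E}K\Big(\frac{X - t}{h^{1/d}}\Big). \tag{2.1} \]

We first show that the last term with the expectation can be ignored for
certain $t$'s. To that end we need the following lemma.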

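Lemma 2.1 For $a_n \le h \le b_n$ and for large enough $n$, we have for all $t \in B_f$,

\[ \frac{n\psi(t)}{\lambda_n(h)}\,\mathbb{E}K\Big(\frac{X - t}{h^{1/d}}\Big) \le \gamma_n + 2\kappa\sqrt{\frac{nh}{|\log h|}}\,f(t)\psi(t), \]

where $\gamma_n \to 0$.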
Proof. For any $r > 0$, we can split the centering term as follows in two
parts:

\[ \frac{n\psi(t)}{\lambda_n(h)}\,\mathbb{E}K\Big(\frac{X - t}{h^{1/d}}\Big) \le \frac{\kappa nh\psi(t)}{\lambda_n(h)}\sup_{\substack{|u|\le 1/2\\ t+uh^{1/d}\in B_f}} f(t + uh^{1/d}) \]
\[ \le \frac{\kappa nh\psi(t)}{\lambda_n(h)}\sup_{\substack{|u|\le 1/2\\ t+uh^{1/d}\in B_f}} f(t + uh^{1/d})\,I\{f(t) \le h^r\} + \frac{\kappa nh\psi(t)}{\lambda_n(h)}\sup_{\substack{|u|\le 1/2\\ t+uh^{1/d}\in B_f}} f(t + uh^{1/d})\,I\{f(t) > h^r\} \]
\[ =: \gamma_n(t,h) + \xi_n(t,h). \]

Now take $0 < \delta < 1 - \beta$ and choose $\tau > 0$ such that

\[ \sup_{a_n\le h\le b_n}\frac{h^{\tau(1-\beta-\delta)}}{(nh)^{-1}\lambda_n(h)} \to 0. \tag{2.2} \]

Note that such a $\tau > 0$ exists, since the denominator does not converge faster
to zero than a negative power of $n$, as does $h \in [a_n, b_n]$. We now study both
terms $\xi_n(t,h)$ and $\gamma_n(t,h)$ for the choice $r = \tau$. For $\delta > 0$ chosen as above,
there are $h_0 > 0$, $c < \infty$ such that for $x, x+y \in B_f$ with $|y| \le h_0$,

\[ c^{-1}f^{1+\delta}(x) \le f(x+y) \le c\,f^{1-\delta}(x). \tag{2.3} \]

Moreover, for the choice of $\tau > 0$ we obtain by condition (D.ii) that for all
$h$ small enough and $x \in B_f$ with $f(x) \ge h^{\tau}$,

\[ f(x+y) \le 2f(x), \quad |y| \le h^{1/d}. \tag{2.4} \]

Therefore, in view of (2.4) and recalling the definition of $\lambda_n(h)$, we get for
$t \in \mathbb{R}^d$ that

\[ \xi_n(t,h) \le 2\kappa\sqrt{\frac{nh}{|\log h|}}\,f(t)\psi(t). \tag{2.5} \]

Finally, using condition (WD.i) in combination with (2.2) and (2.3), it is easy
to show that

\[ \sup_{t\in\mathbb{R}^d}\ \sup_{a_n\le h\le b_n}\gamma_n(t,h) =: \gamma_n \to 0, \]

finishing the proof of the lemma. □
To simplify notation we set

\[ \Delta_n := \sup_{a_n\le h\le b_n}\sqrt{\frac{nh}{|\log h|}}\,\|\psi(f_{n,h} - \mathbb{E}f_{n,h})\|_\infty \]

and set for any function $g : \mathbb{R}^d \to \mathbb{R}$ and $C \subset \mathbb{R}^d$, $\|g\|_C := \sup_{t\in C}|g(t)|$.
We start by showing that choosing a suitable $r > 0$ it will be sufficient to
consider the above supremum only over the region

\[ A_n := \{t \in B_f : \psi(t) \le b_n^{-r}\} \subset \mathbb{R}^d. \tag{2.6} \]

Lemma 2.2 There exists an $r > 0$ such that with probability one,

\[ \sup_{a_n\le h\le b_n}\sqrt{\frac{nh}{|\log h|}}\,\|\psi(f_{n,h} - \mathbb{E}f_{n,h})\|_{\mathbb{R}^d\setminus A_n} \to 0. \]



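Proof. Choose $r > 0$ sufficiently large so that, eventually, $b_n^r \le n^{-2}$. Note
that $\psi(t) > b_n^{-r}$ implies that $f(t) \le b_n^{r/\beta}$, and consequently we get that
$f(t)\psi(t) \le f(t)^{1-\beta} \le b_n^{r(1/\beta - 1)}$, such that for $\beta < 1/2$ this last term is
bounded above by $n^{-2}$ for large $n$. Recalling Lemma 2.1 we can conclude
that

\[ \sup_{a_n\le h\le b_n}\sqrt{\frac{nh}{|\log h|}}\,\|\psi\,\mathbb{E}f_{n,h}\|_{\mathbb{R}^d\setminus A_n} \to 0, \]

and it remains to be shown that with probability one,

\[ Y_n := \sup_{a_n\le h\le b_n}\sqrt{\frac{nh}{|\log h|}}\,\|\psi f_{n,h}\|_{\mathbb{R}^d\setminus A_n} \to 0. \]

It is obvious that

\[ \mathbb{P}\{Y_n \ne 0\} \le \sum_{i=1}^{n}\mathbb{P}\{d(X_i, \mathbb{R}^d\setminus A_n) \le b_n\}, \]

where as usual $d(x, A) = \inf_{y\in A}|x - y|$, $x \in \mathbb{R}^d$. Then, since $\psi(s) > b_n^{-r}$
implies by (W.ii) that $\psi(t) \ge c^{-1}b_n^{-r(1-\delta)}$ for $n$ large enough, $|s - t| \le b_n$ and
$\delta > 0$, due to our choice of $r$, it is possible to find a small $\delta > 0$ such that,
eventually, $\psi(t) \ge \lambda(n^3)$. Hence, it follows using (1.2) that

\[ \mathbb{P}\{Y_n \ne 0\} \le n\,\mathbb{P}\{\psi(X) \ge \lambda(n^3)\} = O(n^{-2}), \]

which via Borel-Cantelli implies that with probability one, $Y_n = 0$ eventually.
□

We now study the remaining part of the process $\Delta_n$, that is

\[ \Delta_n' := \sup_{a_n\le h\le b_n}\sqrt{\frac{nh}{|\log h|}}\,\|\psi(f_{n,h} - \mathbb{E}f_{n,h})\|_{A_n}. \]

We will handle the uniformity in bandwidth over the region $A_n$ by considering
smaller intervals $[h_{n,j}, h_{n,j+1}]$, where we set

\[ h_{n,j} := 2^j a_n, \quad n \ge 1,\ j \ge 0. \]

The following lemma shows that a finite number of such intervals is enough
to cover $[a_n, b_n]$.

Lemma 2.3 If $l_n := \max\{j : h_{n,j} \le 2b_n\}$, then for $n$ large enough, $l_n \le
2\log n$ and $[a_n, b_n] \subset [h_{n,0}, h_{n,l_n})$.

Proof. Suppose $l_n > 2\log n$, then there is a $j_0 > 2\log n$ such that $h_{n,j_0} \le
2b_n$, and hence this $j_0$ satisfies $4^{\log n}n^{-\alpha}L_1(n) \le h_{n,j_0} \le 2n^{-\mu}L_2(n)$.
Consequently, we must have $n \le 2n^{\alpha-\mu}L_2(n)/L_1(n)$, which for large $n$ is impossible
given that $L_2/L_1$ is slowly varying at infinity. The second part of the lemma
follows immediately after noticing that $h_{n,0} = a_n$ and $b_n \le h_{n,l_n}$. □

For each $j \ge 0$, split $A_n$ into the regions

\[ A^1_{n,j} := \Big\{t \in A_n : f(t)\psi(t) \le \epsilon_n^{1-\beta}\sqrt{\frac{|\log h_{n,j+1}|}{nh_{n,j+1}}}\Big\}, \]
\[ A^2_{n,j} := \Big\{t \in A_n : \psi(t) \le \epsilon_n^{-\beta}\Big(\frac{nh_{n,j+1}}{|\log h_{n,j+1}|}\Big)^{\beta/(2(1-\beta))}\Big\}, \]

where we take $\epsilon_n = (\log n)^{-1}$, $n \ge 2$. Note that if $f\psi > L$, by condition
(WD.i), $\psi \le L^{-\beta/(1-\beta)}$, implying that for all $j \ge 0$, the union of $A^1_{n,j}$ and
$A^2_{n,j}$ equals $A_n$. With (2.1) in mind, set for $0 \le j \le l_n - 1$ and $i = 1, 2$,

\[ \Phi^{(i)}_{n,j} := \sup_{t\in A^i_{n,j}}\ \sup_{h_{n,j}\le h\le h_{n,j+1}}\frac{\psi(t)}{\lambda_n(h)}\sum_{k=1}^{n}K\Big(\frac{X_k - t}{h^{1/d}}\Big), \]
\[ \Delta^{(i)}_{n,j} := \sup_{t\in A^i_{n,j}}\ \sup_{h_{n,j}\le h\le h_{n,j+1}}\frac{\psi(t)}{\lambda_n(h)}\Big|\sum_{k=1}^{n}\Big\{K\Big(\frac{X_k - t}{h^{1/d}}\Big) - \mathbb{E}K\Big(\frac{X - t}{h^{1/d}}\Big)\Big\}\Big|, \]
\[ \Psi^{(i)}_{n,j} := \sup_{t\in A^i_{n,j}}\ \sup_{h_{n,j}\le h\le h_{n,j+1}}\frac{n\psi(t)}{\lambda_n(h)}\,\mathbb{E}K\Big(\frac{X - t}{h^{1/d}}\Big). \]

In particular, we have

\[ \Delta^{(i)}_{n,j} \le \Phi^{(i)}_{n,j} + \Psi^{(i)}_{n,j}, \quad i = 1, 2, \]

and from Lemma 2.1 and the definition of $A^1_{n,j}$ it follows that we can ignore
the centering term $\Psi^{(1)}_{n,j}$. Hence, we get that

\[ \Delta_n' \le \delta_n + \max_{0\le j\le l_n-1}\Phi^{(1)}_{n,j} \vee \max_{0\le j\le l_n-1}\Delta^{(2)}_{n,j}, \tag{2.7} \]

with $\delta_n \to 0$, and we will prove stochastic boundedness of $\Delta_n'$ by showing it
for both $\max_{0\le j\le l_n-1}\Phi^{(1)}_{n,j}$ and $\max_{0\le j\le l_n-1}\Delta^{(2)}_{n,j}$. Therefore, set

\[ \lambda_{n,j} := \lambda_n(h_{n,j}) = \sqrt{2^j}\sqrt{na_n|\log 2^j a_n|}, \quad j \ge 0, \]

and note that $\lambda_{n,j} \ge \lambda(n2^{-j})$. Let us start with the first term, $\Phi^{(1)}_{n,j}$. We clearly
have for $0 \le j \le l_n - 1$ that

\[ \Phi^{(1)}_{n,j} \le \kappa\sup_{t\in A^1_{n,j}}\frac{\psi(t)}{\lambda_{n,j}}\sum_{i=1}^{n}I\{|X_i - t| \le h_{n,j}^{1/d}\} =: \kappa\Lambda_{n,j}. \]

For $k = 1, \ldots, n$, set $B_{n,j,k} := \{t : |X_k - t| \le h_{n,j}^{1/d}\}$; then it easily
follows that

\[ \Lambda_{n,j} = \max_{1\le k\le n}\ \sup_{t\in A^1_{n,j}\cap B_{n,j,k}}\frac{\psi(t)}{\lambda_{n,j}}\sum_{i=1}^{n}I\{|X_i - t| \le h_{n,j}^{1/d}\}. \]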
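The dyadic bandwidth grid $h_{n,j} = 2^j a_n$, the index $l_n = \max\{j : h_{n,j} \le 2b_n\}$ and the normalizers $\lambda_{n,j} = \lambda_n(h_{n,j})$ used above can be tabulated directly. The following sketch does so under the illustrative assumption $a_n = n^{-\alpha}$, $b_n = n^{-\mu}$ with $\alpha = 0.8$ and $\mu = 0.2$ (constant slowly varying parts; these values and the function names are hypothetical), and it reports the quantities appearing in Lemma 2.3.

```python
import math

def bandwidth_grid(n, alpha=0.8, mu=0.2):
    """Return (a_n, b_n, grid) with grid[j] = h_{n,j} = 2^j a_n for j = 0, ..., l_n.

    Assumes a_n = n^{-alpha} and b_n = n^{-mu}; l_n is the largest j
    such that h_{n,j} <= 2 b_n, so l_n = len(grid) - 1.
    """
    a_n, b_n = n ** (-alpha), n ** (-mu)
    grid = []
    j = 0
    while 2 ** j * a_n <= 2 * b_n:
        grid.append(2 ** j * a_n)
        j += 1
    return a_n, b_n, grid

def lam_nj(n, h_nj):
    """lambda_{n,j} = sqrt(n * h_{n,j} * |log h_{n,j}|)."""
    return math.sqrt(n * h_nj * abs(math.log(h_nj)))

n = 10 ** 4
a_n, b_n, grid = bandwidth_grid(n)
l_n = len(grid) - 1
print(l_n, 2 * math.log(n))              # l_n stays below 2 log n
print(grid[0] == a_n, grid[-1] >= b_n)   # the grid covers [a_n, b_n]
print([round(lam_nj(n, h), 2) for h in grid])
```

Only of the order of $\log n$ bandwidth blocks are needed, which is what later keeps the union bounds over $j$ harmless.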
Recall from (2.6) that $\psi \le b_n^{-r} \le h_{n,j}^{-r}$ on $A_n$ for $0 \le j \le l_n - 1$. Then it
follows from conditions (W.iii) and (WD.ii) that there is a $\rho$ small such that
$(1-\rho)\psi(t) \le \psi(s) \le (1+\rho)\psi(t)$ and $f(s) \le (1+\rho)f(t)$ if $|s - t| \le h_{n,j}^{1/d}$. In
this way we obtain for $t \in A^1_{n,j}$, $|s - t| \le h_{n,j}^{1/d}$ and large enough $n$ that for a
positive constant $C_1 > 1$,

\[ \psi(t) \le C_1\psi(s) \quad\text{and}\quad f(s)\psi(s) \le C_1\epsilon_n^{1-\beta}\sqrt{\frac{|\log h_{n,j+1}|}{nh_{n,j+1}}}. \]

Hence, we can conclude that

\[ \Lambda_{n,j} \le C_1\max_{1\le k\le n}\frac{\psi(X_k)}{\lambda_{n,j}}\sum_{i=1}^{n}I\{|X_i - X_k| \le 2h_{n,j}^{1/d}\}\,I\{X_k \in \tilde A^1_{n,j}\}, \tag{2.8} \]

where $\tilde A^1_{n,j} := \{t : f(t)\psi(t) \le C_1\epsilon_n^{1-\beta}\sqrt{|\log h_{n,j+1}|/nh_{n,j+1}}\}$, and it follows
that

\[ \max_{0\le j\le l_n-1}\Lambda_{n,j} \le C_1\max_{1\le k\le n}\frac{\psi(X_k)}{\lambda(n)} + C_1\max_{0\le j\le l_n-1}\ \max_{1\le k\le n}\frac{\psi(X_k)}{\lambda_{n,j}}\,M_{n,j,k}\,I\{X_k \in \tilde A^1_{n,j}\}, \tag{2.9} \]

where $M_{n,j,k} := \sum_{i=1}^{n}I\{|X_i - X_k| \le 2h_{n,j}^{1/d}\} - 1$. Note that the first term
is stochastically bounded by assumption (1.2). Thus in order to show that
$\max_{0\le j\le l_n-1}\Phi^{(1)}_{n,j}$ is stochastically bounded, it is enough to show that this is
also the case for the second term in (2.9). As a matter of fact, it follows from
the following lemma that this term converges to zero in probability.



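Lemma 2.4 We have for $1 \le k \le n$ and $\epsilon > 0$,

\[ \max_{0\le j\le l_n-1}\mathbb{P}\{\psi(X_k)M_{n,j,k}\,I\{X_k \in \tilde A^1_{n,j}\} \ge \epsilon\lambda_{n,j}\} = O(n^{-1-\eta}), \]

where $\eta > 0$ is a constant depending on $\alpha$ and $\beta$ only.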

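Proof. Given $X_k = t$, $M_{n,j,k}$ has a Binomial$(n-1, \pi_{n,j}(t))$ distribution,
where $\pi_{n,j}(t) := \mathbb{P}\{|X - t| \le 2h_{n,j}^{1/d}\}$. Furthermore, since for large enough
$n$, $\psi(t) \le C_1 b_n^{-r} \le b_n^{-r-1}$ on $A_n$, it follows for $c > 1$ and large $n$ that
$f(s)/f(t) \le c$, $|s - t| \le b_n^{1/d}$, so that

\[ \pi_{n,j}(t) \le 4^d c\,h_{n,j}f(t). \]

Using the fact that the moment-generating function $\mathbb{E}\exp(sZ)$ of a Binomial$(n,p)$-variable
$Z$ is bounded above by $\exp(npe^s)$, we can conclude that for $t \in \tilde A^1_{n,j}$
and any $s > 0$,

\[ p_{n,j}(t) := \mathbb{P}\{\psi(X_k)M_{n,j,k} \ge \epsilon\lambda_{n,j} \mid X_k = t\} \]
\[ \le \exp\Big(c4^d nh_{n,j}f(t)e^s - \frac{\epsilon s\lambda_{n,j}}{\psi(t)}\Big) \]
\[ \le \exp\Big(\frac{\lambda_{n,j}}{\psi(t)}\big(C_2\epsilon_n^{1-\beta}e^s - \epsilon s\big)\Big), \quad s > 0,\ t \in \tilde A^1_{n,j}. \]

Choosing $s = \log(1/\epsilon_n)/2 = \log\log n/2$, we obtain for some $n_0$ (which is
independent of $j$) that

\[ p_{n,j}(t) \le \exp\Big(-\frac{\epsilon\lambda_{n,j}\log\log n}{4\psi(t)}\Big), \quad n \ge n_0,\ t \in \tilde A^1_{n,j}. \]

Setting $B_{n,j} := \{t \in \tilde A^1_{n,j} : \psi(t) \le \lambda_{n,j}/\log n\}$, it is obvious that for any $\eta > 0$,

\[ \max_{0\le j\le l_n-1}\ \sup_{t\in B_{n,j}}p_{n,j}(t) = O(n^{-\eta}). \tag{2.10} \]

Next, set $C_{n,j} := \tilde A^1_{n,j}\setminus B_{n,j} = \{t \in \tilde A^1_{n,j} : \lambda_{n,j}/\log n < \psi(t)\}$, then using once
more the fact that $\psi \le f^{-\beta}$, we have that $\psi f \le (\log n/\lambda_{n,j})^{1+\theta}$ on this set,
where $\theta = 1/\beta - 2 > 0$. By Markov's inequality, we then have for $t \in C_{n,j}$,

\[ p_{n,j}(t) \le 4^d c\,\epsilon^{-1}nh_{n,j}f(t)\psi(t)/\lambda_{n,j} \]
\[ \le 4^d c\,\epsilon^{-1}(\log n)^{1+\theta}\lambda_{n,j}^{-\theta}/|\log h_{n,j}| \]
\[ \le 4^d c'\epsilon^{-1}\Big(\frac{\log n}{na_n}\Big)^{\theta/2}, \quad t \in C_{n,j}. \tag{2.11} \]

Further, note that by regular variation, $\lambda_{n,j}/\log n \ge \lambda(n(\log n)^{-\gamma})$ for some
$\gamma > 0$. Therefore, we have from (1.2) that

\[ \mathbb{P}\{\psi(X_k) \ge \lambda_{n,j}/\log n\} = O\big((\log n)^{\gamma}/n\big), \quad k = 1, \ldots, n. \]

Combining this with (2.10) and (2.11), we find that

\[ \max_{0\le j\le l_n-1}\mathbb{P}\{\psi(X_k)M_{n,j,k}\,I\{X_k \in \tilde A^1_{n,j}\} \ge \epsilon\lambda_{n,j}\} \]
\[ \le \max_{0\le j\le l_n-1}\Big\{\int_{B_{n,j}}p_{n,j}(t)f(t)\,dt + \int_{C_{n,j}}p_{n,j}(t)f(t)\,dt\Big\} \]
\[ \le O(n^{-\eta}) + O\big((\log n/na_n)^{\theta/2}\big)\,\mathbb{P}\{\psi(X) \ge \lambda_{n,j}/\log n\} \]
\[ = O\big(n^{-1-\theta(1-\alpha)/2}(\log n)^{\gamma+\theta/2}L_1(n)^{-\theta/2}\big), \]

proving the lemma. □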
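The elementary bound on the binomial moment-generating function used in this proof, $\mathbb{E}\exp(sZ) = (1 - p + pe^s)^n \le \exp(npe^s)$, and the resulting exponential tail bound $\mathbb{P}\{Z \ge x\} \le \exp(npe^s - sx)$ can be checked numerically. The sketch below is only an illustration; the parameter values are arbitrary and the function names are not taken from the paper.

```python
import math

def binomial_mgf(n, p, s):
    """Exact moment-generating function E exp(sZ) of Z ~ Binomial(n, p)."""
    return (1.0 - p + p * math.exp(s)) ** n

def mgf_bound(n, p, s):
    """Upper bound exp(n p e^s), valid because 1 + x <= e^x."""
    return math.exp(n * p * math.exp(s))

def binomial_tail(n, p, x):
    """Exact tail probability P{Z >= x} for an integer x."""
    return sum(math.comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(x, n + 1))

def chernoff_tail_bound(n, p, x, s):
    """Markov's inequality applied to exp(sZ): P{Z >= x} <= exp(n p e^s - s x)."""
    return math.exp(n * p * math.exp(s) - s * x)

n, p, x = 50, 0.02, 5
s = math.log(5.0)
print(binomial_mgf(n, p, s), mgf_bound(n, p, s))
print(binomial_tail(n, p, x), chernoff_tail_bound(n, p, x, s))
```

In the proof the same device is applied conditionally on $X_k = t$, with $x = \epsilon\lambda_{n,j}/\psi(t)$ and $s = \log\log n/2$.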

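It is now clear that $\max_{0\le j\le l_n-1}\Phi^{(1)}_{n,j}$ is stochastically bounded under
condition (1.2), and it remains to be shown that this is also the case for
$\max_{0\le j\le l_n-1}\Delta^{(2)}_{n,j}$.

Let $\alpha_n$ be the empirical process based on the i.i.d. sample $X_1, \ldots, X_n$.
Then we have for any measurable bounded function $g : \mathbb{R}^d \to \mathbb{R}$,

\[ \alpha_n(g) := \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\big(g(X_i) - \mathbb{E}g(X)\big). \]

For $0 \le j \le l_n - 1$, consider the following class of functions defined by

\[ \mathcal{G}_{n,j} := \Big\{\psi(t)K\Big(\frac{\cdot - t}{h^{1/d}}\Big) : t \in A^2_{n,j},\ h_{n,j} \le h \le h_{n,j+1}\Big\}, \]

then obviously,

\[ \Delta^{(2)}_{n,j} \le \frac{\|\sqrt{n}\alpha_n\|_{\mathcal{G}_{n,j}}}{\lambda_{n,j}}, \]

where as usual $\|\sqrt{n}\alpha_n\|_{\mathcal{G}_{n,j}} = \sup_{g\in\mathcal{G}_{n,j}}|\sqrt{n}\alpha_n(g)|$. To show stochastic
boundedness of $\Delta^{(2)}_{n,j}$, we will use a standard technique for empirical processes, based
on a useful exponential inequality of Talagrand [14], in combination with
an appropriate upper bound of the moment quantity $\mathbb{E}\|\sum_{i=1}^{n}\varepsilon_i g(X_i)\|_{\mathcal{G}_{n,j}}$,
where $\varepsilon_1, \ldots, \varepsilon_n$ are independent Rademacher random variables, independent
of $X_1, \ldots, X_n$.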
Lemma 2.5 For each $j = 0, \ldots, l_n - 1$, the class $\mathcal{G}_{n,j}$ is a VC-class of functions
with envelope function

\[ G_{n,j} := \kappa\,\epsilon_n^{-\beta}\Big(\frac{nh_{n,j+1}}{|\log h_{n,j+1}|}\Big)^{\beta/(2(1-\beta))} \]

that satisfies the uniform entropy condition

\[ N(\epsilon, \mathcal{G}_{n,j}) \le C\epsilon^{-\nu-1}, \quad 0 < \epsilon < 1, \]

where $C$ and $\nu$ are positive constants (independent of $n$ and $j$).

Proof. Consider the classes

\[ \mathcal{F}_{n,j} = \{\psi(t) : t \in A^2_{n,j}\}, \]
\[ \mathcal{K}_{n,j} = \Big\{K\Big(\frac{\cdot - t}{h^{1/d}}\Big) : t \in A^2_{n,j},\ h_{n,j} \le h \le h_{n,j+1}\Big\}, \]

with envelope functions $F_{n,j} := \epsilon_n^{-\beta}\big(nh_{n,j+1}/|\log h_{n,j+1}|\big)^{\beta/(2(1-\beta))}$ and $\kappa$ respectively.
Then $\mathcal{G}_{n,j} \subset \mathcal{F}_{n,j}\mathcal{K}_{n,j}$ and it follows from our assumptions on $K$ that $\mathcal{K}_{n,j}$
is a VC-class of functions. Furthermore, it is easy to see that the covering
number of $\mathcal{F}_{n,j}$, which we consider as a class of constant functions, can be
bounded above as follows:

\[ N\big(\epsilon\sqrt{Q(F_{n,j}^2)}, \mathcal{F}_{n,j}, d_Q\big) \le C_1\epsilon^{-1}, \quad 0 < \epsilon < 1. \]

Since $\mathcal{K}_{n,j}$ is a VC-class, we have for some positive constants $\nu$ and $C_2 < \infty$
that

\[ N(\epsilon\kappa, \mathcal{K}_{n,j}, d_Q) \le C_2\epsilon^{-\nu}. \]

Thus, the conditions of Lemma A.1 in Einmahl and Mason [3] are satisfied,
and we obtain the following uniform entropy bound for $\mathcal{G}_{n,j}$:

\[ N(\epsilon, \mathcal{G}_{n,j}) \le C\epsilon^{-\nu-1}, \quad 0 < \epsilon < 1, \]

proving the lemma. □
Now, observe that for all $t \in A^2_{n,j} \subset A_n$ and $h_{n,j} \le h \le h_{n,j+1}$, we have
by condition (W.iii) for large $n$,

\[ \mathbb{E}\Big[\psi^2(t)K^2\Big(\frac{X - t}{h^{1/d}}\Big)\Big] \le 2\,\mathbb{E}\Big[\psi^2(X)K^2\Big(\frac{X - t}{h^{1/d}}\Big)\Big] = 2\int_{\mathbb{R}^d}\psi^2(x)f(x)K^2\big((x - t)/h^{1/d}\big)\,dx. \]

Recalling that $\|\psi f^{\beta}\|_\infty \le 1$, we see that this integral is bounded above by

\[ 2\|f\|_\infty^{1-2\beta}\,h\int_{\mathbb{R}^d}K^2(u)\,du \le C_\beta h_{n,j+1} \]

for a finite constant $C_\beta$.
As the exponent $\beta/(2(1-\beta))$ in the definition of $G_{n,j}$ is strictly smaller than
$1/2$, it is easily checked that by choosing the $\beta$ in Proposition A.1 of Einmahl
and Mason [3] to be equal to $G_{n,j}$, and $\sigma^2_{n,j} = C_\beta h_{n,j+1}$, there exists an $n_0 \ge 1$
so that the assumptions of Proposition A.1 in Einmahl and Mason [3] are
satisfied for all $0 \le j \le l_n - 1$ and $n \ge n_0$. Therefore, we can conclude that

\[ \mathbb{E}\Big\|\sum_{i=1}^{n}\varepsilon_i g(X_i)\Big\|_{\mathcal{G}_{n,j}} \le C'\sqrt{nh_{n,j}\log n}, \quad n \ge n_0,\ 0 \le j \le l_n - 1, \]

where $C'$ is a positive constant depending on $\alpha$, $\beta$, $\nu$ and $C$ only (where the
$\beta$ is again the one from condition (WD.i)). Moreover, as for $0 \le j \le l_n - 1$
we have $|\log h_{n,j}| \ge |\log b_n| \sim \mu\log n$, we see that for some $n_1 \ge n_0$,

\[ \mathbb{E}\Big\|\sum_{i=1}^{n}\varepsilon_i g(X_i)\Big\|_{\mathcal{G}_{n,j}} \le C''\lambda_{n,j}, \quad n \ge n_1,\ 0 \le j \le l_n - 1. \tag{2.12} \]

Recalling that $\mathbb{E}\Delta^{(2)}_{n,j} \le 2\,\mathbb{E}\|\sum_{i=1}^{n}\varepsilon_i g(X_i)\|_{\mathcal{G}_{n,j}}/\lambda_{n,j}$, it follows from Markov's
inequality that the variables $\Delta^{(2)}_{n,j}$ are stochastically bounded for all $0 \le j \le
l_n - 1$. However, to prove that the maximum of these variables is stochastically
bounded too, we need to use more sophisticated tools. One of them
is the inequality of Talagrand [14] mentioned above. (For a suitable version,
refer to Inequality A.1 in [3].) Employing this inequality, we get that



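\[ \mathbb{P}\Big\{\max_{1\le m\le n}\|\sqrt{m}\alpha_m\|_{\mathcal{G}_{n,j}} \ge A_1\Big(\mathbb{E}\Big\|\sum_{i=1}^{n}\varepsilon_i g(X_i)\Big\|_{\mathcal{G}_{n,j}} + x\Big)\Big\} \le 2\Big[\exp\Big(-\frac{A_2x^2}{n\sigma^2_{n,j}}\Big) + \exp\Big(-\frac{A_2x}{G_{n,j}}\Big)\Big], \]

where $A_1, A_2$ are universal constants. Next, recall that $\sigma^2_{n,j} = 2C_\beta h_{n,j}$ and
that $G_{n,j} \le c\,\epsilon_n^{-\beta}\big(nh_{n,j}/|\log h_{n,j}|\big)^{\beta/(2(1-\beta))}$, then choosing $x = \rho\lambda_{n,j}$ ($\rho > 1$), we can
conclude from the foregoing inequality and (2.12) that for large $n$,

\[ \mathbb{P}\{\|\sqrt{n}\alpha_n\|_{\mathcal{G}_{n,j}} \ge A_1(C'' + \rho)\lambda_{n,j}\} \le 2\Big[\exp\Big(-\frac{A_2\rho^2\lambda^2_{n,j}}{2C_\beta nh_{n,j}}\Big) + \exp\Big(-\frac{A_2\rho\lambda_{n,j}}{G_{n,j}}\Big)\Big] \]
\[ \le 4\exp\Big(-\frac{A_2\rho^2}{2C_\beta}|\log h_{n,j}|\Big), \tag{2.13} \]

where we used the fact that $\inf_{0\le j\le l_n-1}\lambda_{n,j}/(G_{n,j}|\log h_{n,j}|) \to \infty$ as $n \to \infty$.
Finally, since $\|\sqrt{n}\alpha_n\|_{\mathcal{G}_{n,j}} \ge \lambda_{n,j}\Delta^{(2)}_{n,j}$, we just showed that

\[ \mathbb{P}\Big\{\max_{0\le j\le l_n-1}\Delta^{(2)}_{n,j} \ge M\Big\} \le \sum_{j=0}^{l_n-1}\mathbb{P}\{\|\sqrt{n}\alpha_n\|_{\mathcal{G}_{n,j}} \ge \lambda_{n,j}M\} \le 4n^{-2}, \tag{2.14} \]

provided we choose $M \ge A_1\big(C'' + \sqrt{6C_\beta/(\mu A_2)}\big)$ and $n$ is large enough. It is
now obvious that $\max_{0\le j\le l_n-1}\Delta^{(2)}_{n,j}$ is stochastically bounded, which, in
combination with (2.9) and the result in Lemma 2.4, proves Theorem 1.1. □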



3 Proof of Theorem 1.2 

In view of Lemma 2.2 it is sufficient to prove that under assumption (1.4),
we have with probability one that

\[ \limsup_{n\to\infty}\Delta_n' \le M', \]

for a suitable positive constant $M' > 0$. Recalling relation (2.7), we only
need to show that for suitable positive constants $M_1', M_2'$,

\[ \limsup_{n\to\infty}\ \max_{0\le j\le l_n-1}\Phi^{(1)}_{n,j} \le M_1', \quad\text{a.s.,} \tag{3.1} \]

and

\[ \limsup_{n\to\infty}\ \max_{0\le j\le l_n-1}\Delta^{(2)}_{n,j} \le M_2', \quad\text{a.s.} \tag{3.2} \]

The result in (3.2) follows easily from (2.14) and the Borel-Cantelli lemma,
and as is shown below, it turns out that (3.1) holds with $M_1' = 0$, i.e. this
term goes to zero. Recall now from (2.9) that

\[ \max_{0\le j\le l_n-1}\Phi^{(1)}_{n,j} \le C_1\kappa\max_{1\le k\le n}\frac{\psi(X_k)}{\lambda(n)} + C_1\kappa\max_{0\le j\le l_n-1}\ \max_{1\le k\le n}\frac{\psi(X_k)}{\lambda_{n,j}}\,M_{n,j,k}\,I\{X_k \in \tilde A^1_{n,j}\}, \]

where $M_{n,j,k} = \sum_{i=1}^{n}I\{|X_i - X_k| \le 2h_{n,j}^{1/d}\} - 1$. From condition (1.4) and the
assumption on $a_n$ we easily get that with probability one, $\psi(X_n)/\lambda(n) \to 0$,
and consequently we also have that $\max_{1\le k\le n}\psi(X_k)/\lambda(n) \to 0$, finishing the
study of the first term. To simplify notation, set

\[ Z_n := \max_{0\le j\le l_n-1}\ \max_{1\le k\le n}\frac{\psi(X_k)}{\lambda_{n,j}}\,M_{n,j,k}\,I\{X_k \in \tilde A^1_{n,j}\}, \]

take $n_k = 2^k$, $k \ge 1$, and set $h'_{k,j} := h_{n_k,j}$ and $l'_k := l_{n_{k+1}}$. Then note that

\[ \max_{n_k\le n\le n_{k+1}}Z_n \le \max_{0\le j\le l'_k}\ \max_{1\le i\le n_{k+1}}\frac{\psi(X_i)}{\lambda_{n_k,j}}\,M'_{k,j,i}\,I\{X_i \in A'_{k,j}\}, \]

where $M'_{k,j,i} = \sum_{m=1}^{n_{k+1}}I\{|X_m - X_i| \le 2h_{k,j}'^{\,1/d}\} - 1$ and $A'_{k,j} = \{t : f(t)\psi(t) \le
C_1\epsilon_{n_k}^{1-\beta}\sqrt{|\log h'_{k+1,j+1}|/n_k h'_{k,j+1}}\}$, and after some minor modifications, we obtain
similarly to Lemma 2.4 that for $\epsilon > 0$,

\[ \mathbb{P}\Big\{\max_{n_k\le n\le n_{k+1}}Z_n \ge \epsilon\Big\} = O\big(l'_k n_k^{-\eta'}\big), \quad \eta' > 0, \]

which implies again via Borel-Cantelli that $Z_n \to 0$ almost surely, proving
(3.1) with $M_1' = 0$. □